(57) Abstract: The invention relates to a method, an apparatus
and a computer program for encoding successive images. The
method comprises encoding (818) an image block to be encoded
using the motion vector candidate that gives the lowest cost
function value. Before encoding, the image is processed (804) into
an indexed image and the reference image is indexed into an
indexed reference image so that the image and the reference image
are divided into parts referred to with indexes and a number is
formed from the values of pixels in each part to describe pixel
values in a given part; a search area is defined (814) in the indexed
reference image where the block to be encoded in the indexed
image is searched for, and a cost function is calculated (816) for
each motion vector candidate using the indexed image and the
indexed reference image.
Method, apparatus and computer for encoding successive images

FIELD 

[0001] The invention relates to a method, an apparatus and a
computer for encoding successive images.

BACKGROUND 

[0002] Encoding of successive images, e.g. video, is used to reduce
the amount of data so that the data can be stored more efficiently in a memory
means or transmitted over a telecommunications connection. An example of a
video encoding standard is MPEG-4 (Moving Picture Experts Group). Different
image sizes are in use, e.g. the cif size is 352 x 288 pixels and the qcif
size is 176 x 144 pixels.

[0003] A single image is typically divided into blocks containing
information on luminance, colour and location. The data included in the blocks
are compressed blockwise by a desired encoding method. The compression is
based on deletion of less significant data. Compression methods are mainly
divided into three classes: spectral redundancy reduction, spatial redundancy
reduction and temporal redundancy reduction. Typically, various combinations
of these methods are used in compression.

[0004] For example, a YUV colour model is used to reduce spectral
redundancy. The YUV model utilizes the fact that the human eye is more
sensitive to variations in luminance, i.e. light, than to variations in
chrominance, i.e. colour. The YUV model includes one luminance component (Y)
and two chrominance components (U and V, or Cb and Cr). For example, a
luminance block in accordance with the H.263 video encoding standard is
16 x 16 pixels and both chrominance blocks, which cover the same area as the
luminance block, are 8 x 8 pixels. A combination of one luminance block and
two chrominance blocks is called a macro block. Each pixel in both the
luminance and the chrominance blocks may receive a value from 0 to 255, i.e.
eight bits are needed to present one pixel. For example, value 0 of a
luminance pixel means black and value 255 white.

[0005] Spatial redundancy is reduced using the discrete cosine
transform (DCT), for instance. In the discrete cosine transform, the pixel
presentation of a block is transformed into a spatial frequency presentation.
Furthermore, in an image block only the signal frequencies that appear in it
have high-amplitude factors, and the factors of the signals which do not
appear in the block are close to zero. In principle, the discrete cosine
transform is a lossless transform and interference is caused to the signal
only in quantization.

[0006] Temporal redundancy is reduced by utilizing the fact that
successive images usually resemble each other. Thus, instead of compressing
each single image, motion data are generated on the blocks. This is called
motion compensation. A reference image stored in the memory earlier is
searched for a previously encoded block that matches the block to be encoded
as well as possible, the motion between the reference block and the block to
be encoded is modelled, and the calculated motion vectors are transmitted to
the receiver. The difference between the block to be encoded and the
reference block is expressed as difference data. This kind of coding is known
as inter-coding, which means utilization of similarities between images of
the same image sequence.

[0007] A search area, in which a block similar to the one in the image
to be encoded is searched for, is typically defined in the reference image.
The best correspondence is found by calculating a cost function, e.g. a sum
of absolute differences (SAD), between the pixels of the block in the search
area and the pixels of the block to be encoded.

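[0008] Motion estimation in the search area can be presented by
the formula:

$$MV = \operatorname*{arg\,min}_{x,\,y \,\in\, [-16,\,16]} \sum_{i=0}^{15} \sum_{j=0}^{15} \left| f_{i,j} - r_{i+x,\,j+y} \right| \qquad (1)$$

where MV is the final motion vector, $f_{i,j}$ is a pixel of the macro block
to be encoded, and $r_{i+x,\,j+y}$ is a pixel of the reference image in the
search area.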

[0009] Prior art solutions have used full search, i.e. all possible
or nearly all possible motion vectors have been set as motion vector
candidates. A problem associated with full search is that the number of
calculations required is large. For example, if the size of the search area
is 48 x 48 pixels, the number of feasible motion vectors is 33 x 33 with an
accuracy of one pixel. If the size of the luminance block is 16 x 16 pixels,
16 x 16 = 256 calculations are needed to calculate one sum of absolute
differences, and thus 33 x 33 x 256 = 278 784 calculations are needed to
calculate the sums of absolute differences of all possible motion vectors for
one macro block. An image of the qcif size, for example, includes 99 macro
blocks, i.e. the number of calculations needed per image is
99 x 278 784 = 27 599 616.
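
The cost of full search can be made concrete with a short sketch in C. It is only an illustration under stated assumptions: the names `sad16` and `full_search`, the row-major 8-bit luminance layout and the rule of skipping candidates that point outside the image are hypothetical and not taken from the text. For a block whose whole [-16, 16] search range lies inside the image, the loop evaluates 33 x 33 = 1089 candidates of 256 absolute differences each.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* SAD between the 16x16 block of cur at column bx, row by and the block of
   ref displaced by dx columns and dy rows (256 absolute differences). */
static unsigned sad16(const uint8_t *cur, const uint8_t *ref, int stride,
                      int bx, int by, int dx, int dy)
{
    unsigned sum = 0;
    for (int i = 0; i < 16; i++) {
        for (int j = 0; j < 16; j++) {
            int a = cur[(by + i) * stride + (bx + j)];
            int b = ref[(by + dy + i) * stride + (bx + dx + j)];
            sum += (unsigned)abs(a - b);
        }
    }
    return sum;
}

/* Full search over dx, dy in [-16, 16] with one-pixel accuracy.
   Candidates reaching outside the image are skipped (an assumption of this
   sketch). Stores the best displacement and returns the number of
   candidates that were evaluated. */
static int full_search(const uint8_t *cur, const uint8_t *ref,
                       int width, int height, int bx, int by,
                       int *best_dx, int *best_dy)
{
    assert(bx >= 0 && by >= 0 && bx + 16 <= width && by + 16 <= height);
    unsigned best = UINT_MAX;
    int evaluated = 0;
    *best_dx = 0;
    *best_dy = 0;
    for (int dy = -16; dy <= 16; dy++) {
        for (int dx = -16; dx <= 16; dx++) {
            if (bx + dx < 0 || by + dy < 0 ||
                bx + dx + 16 > width || by + dy + 16 > height)
                continue;
            unsigned cost = sad16(cur, ref, width, bx, by, dx, dy);
            evaluated++;
            if (cost < best) {
                best = cost;
                *best_dx = dx;
                *best_dy = dy;
            }
        }
    }
    return evaluated;
}
```

Each call to `sad16` performs the 256 difference calculations counted above, which is why the total grows to 278 784 per macro block when all 1089 candidates are tested.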

[0010] Attempts have been made to reduce the number of calculations
by using less comprehensive search methods instead of full search, such as
Three Step Search, Spiral Search and Hierarchical Motion Estimation, but with
these methods deterioration of the image quality may cause problems.

BRIEF DESCRIPTION 

[0011] An object of the invention is to provide an improved method,
an improved apparatus and an improved computer program. One aspect of the
invention provides a method of encoding successive images according to claim
1. One aspect of the invention provides an apparatus according to claim 9 for
encoding successive images. One aspect of the invention provides an
apparatus according to claim 17 for encoding successive images. One aspect of
the invention provides a computer program according to claim 25 for encoding
successive images. The other preferred embodiments of the invention are
disclosed in the dependent claims.

[0012] The invention is based on performing motion estimation
using an indexed image and an indexed reference image.

[0013] The solution according to the invention provides nearly the
same image quality as the classical full search but with fewer calculations.

LIST OF FIGURES 

[0014] The preferred embodiments of the invention will be described 
by examples with reference to the accompanying drawings, in which 

Figure 1 illustrates an apparatus for encoding successive images;

Figure 2 illustrates division of a qcif-sized image into blocks;

Figure 3 illustrates part of an image to be encoded;

Figure 4 illustrates part of a reference image;

Figures 5, 6 and 7 illustrate performance of indexing;

Figure 8 is a flow chart illustrating a method of encoding successive
images.

DESCRIPTION OF EMBODIMENTS 

[0015] Video encoding is well known to a person skilled in the art
from standards and textbooks, e.g. from the following works, which are
incorporated herein by reference: Vasudev Bhaskaran and Konstantinos
Konstantinides: Image and Video Compression Standards - Algorithms and
Architectures, Second Edition, Kluwer Academic Publishers 1997, Chapter 6: The
MPEG video standards; and Digital Video Processing, Prentice Hall Signal
Processing Series, Chapter 6: Block Based Methods.

[0016] The successive images to be encoded are typically moving
images, e.g. video. In the camera, a video image consists of individual
successive images. The camera forms a matrix which presents the images as
pixels, e.g. in the manner described above, where luminance and chrominance
have separate matrixes. The data flow that presents the image as pixels is
supplied to an encoder. It is also feasible to build a device where the data
flow is transmitted to the encoder along a data transmission connection, for
example, or from the memory means of a computer. In that case the purpose is
to compress an uncompressed video image with an encoder for forwarding or
storage. The compressed video image formed by the encoder is transmitted
along a channel to a decoder. In principle, the decoder performs the same
functions as the encoder when it forms an image, but inversely. The channel
may be, for example, a fixed or a wireless data transmission connection. The
channel can also be interpreted as a transmission path which is used for
storing the video image in a memory means, e.g. on a laserdisc, and by means
of which the video image is read from the memory means and processed in the
decoder. Other kinds of encoding can also be performed on the compressed
video image to be transmitted on the channel, e.g. channel coding by a
channel coder. Channel coding is decoded by a channel decoder. The encoder
and decoder may be arranged in different devices, such as computers,
subscriber terminals of different radio systems, e.g. mobile stations, or in
other devices where video is to be processed. The encoder and decoder can
also be arranged in the same device, which can be called a video codec. The
structure of a device for encoding successive images, i.e. an encoder, will
be described with reference to Figure 1.

[0017] Figure 1 describes the function of the encoder on a
theoretical level. In practice, the structure of the encoder will be more
complicated since a person skilled in the art adds necessary prior art
features to it, such as timing and blockwise processing of images. Successive
images 130 are supplied to a frame buffer 102 for temporary storage. A single
image 132 is supplied from the frame buffer 102 to block 104, where the
desired coding mode is selected. The function of the device is controlled by
a control part 100, which selects the desired coding mode and informs block
104 and block 120 of the selected coding mode 156, 158, for instance. The
coding mode may be intra-coding or inter-coding. Motion compensation is not
performed on an intra-coded image whereas an inter-coded image is compensated
for motion. Usually the first image is intra-coded and the following images
are inter-coded. Intra-images can also be transmitted after the first image
if, for example, sufficiently good motion vectors are not found for the image
to be encoded.

[0018] In the following, the function of the apparatus will be
described in a situation where intra-coding has been selected in block 104.

[0019] Block 104 receives only the image 132 arriving from the
frame buffer 102 as input for the intra-image. The image 132 obtained from the
frame buffer 102 is supplied as such 134 to a discrete cosine transform block
106, where the discrete cosine transform described at the beginning is
performed.

[0020] The image 136 on which discrete cosine transform has been
performed is supplied to a quantization block 108, where quantization is
performed, i.e. in principle each element of the image on which discrete
cosine transform has been performed is divided by a constant and the result
of the division is rounded to an integer. This constant may vary between
different macro blocks. A quantization parameter, from which the divisors are
calculated, is typically between 1 and 31. The more zeroes the block
includes, the better it can be packed since zeroes are not transmitted to the
channel.
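
The divide-and-round step can be sketched in C as follows. This is a simplified uniform quantizer for illustration only: the names `quantize_coeff`, `dequantize_coeff` and `quantize_block` are hypothetical, a single divisor is assumed for the whole block, and the way real standards derive divisors from the quantization parameter is not modelled.

```c
#include <assert.h>

/* Divides a transform coefficient by the divisor and rounds the result to
   the nearest integer (ties away from zero). */
static int quantize_coeff(int coeff, int divisor)
{
    assert(divisor > 0);
    if (coeff >= 0)
        return (coeff + divisor / 2) / divisor;
    return -((-coeff + divisor / 2) / divisor);
}

/* Inverse quantization: restores the coefficient as accurately as the
   rounding allows. */
static int dequantize_coeff(int level, int divisor)
{
    return level * divisor;
}

/* Quantizes n coefficients in place and returns how many became zero. */
static int quantize_block(int *coeffs, int n, int divisor)
{
    int zeros = 0;
    for (int k = 0; k < n; k++) {
        coeffs[k] = quantize_coeff(coeffs[k], divisor);
        if (coeffs[k] == 0)
            zeros++;
    }
    return zeros;
}
```

With a divisor of 8, a coefficient of 100 becomes level 13 and is restored as 104, which shows where the loss arises; small coefficients such as 3 become zero and need not be transmitted.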

[0021] Then the quantized image 138 on which discrete cosine
transform has been performed is supplied to a variable length coder 110,
which outputs the encoded image 140 produced by the device.

[0022] In addition to the variable length coder 110, the quantized
image 138 on which discrete cosine transform has been performed is taken
from the quantization block 108 to an inverse quantization block 112, which
performs inverse quantization on the input quantized image 138, i.e. restores
it to image 136 as accurately as possible. Then the inversely quantized image
142 is supplied to an inverse discrete cosine transform block 114, where
inverse discrete cosine transform is performed. Since the discrete cosine
transform is a lossless transform and quantization is not, image 144 does not
completely correspond to image 134. The purpose of inverse quantization and
inverse discrete cosine transform is to produce an image in the encoder which
is similar to the one produced by the decoder corresponding to the encoding
device. The 'decoded' image 144 is then supplied to block 124, where the part
deleted from the image, i.e. difference data, would be added to it if the
image had been inter-coded. Since the image in question is intra-coded,
nothing is added to it. This decision is made by block 120, where
intra-coding is the pre-selected option, in which case there is nothing in
the input of block 120 and thus nothing is included in its output 154
connected to block 124. After this, the intra-image 146 is stored in the
frame buffer 116. Thus a reconstructed image is stored in the frame buffer
116, i.e. the encoded image in the form in which it is after decoding
performed in the decoder. There are two frame buffers: the image arriving at
the device is stored in the first buffer 102 and the reconstructed 'previous'
image is stored in the second buffer 116. The above described how to process
an image for which intra-coding had been selected in blocks 104 and 120.

[0023] Motion compensation can be used starting with the processing
of the next image.

[0024] In that case inter-coding is selected in blocks 104 and 120.
The image stored in the frame buffer 116 is now a reference image and the
image to be encoded is the image 132 to be obtained next from the frame
buffer 102. As appears from Figure 1, the next image is supplied to a motion
estimation block 118 in addition to block 104. The motion estimation block 118
also receives a reference image 150 from the frame buffer 116. The function of
the motion estimation block 118 will be described in greater detail below. At
this point it is sufficient to note that the block searches the reference
image for blocks corresponding to the blocks in the image to be encoded.
Transitions between the blocks are expressed as motion vectors 152, 166, which
are supplied both to the variable length coder 110 and to the frame buffer
116.

[0025] The reference image 148 is taken from the frame buffer 116
to block 122. Block 122 subtracts the reference image 148 from the image 132
to be encoded to provide difference data 164, which are supplied from block
104 via the discrete cosine transform block 106 and quantization block 108 to
the variable length coder 110.

[0026] The variable length coder 110 encodes the difference data
138 and motion vectors 166, in which case the output 140 of the variable
length coder 110 provides an inter-coded image. The variable length coder
110 receives as inputs the quantized difference data 138 on which discrete
cosine transform has been performed and motion vectors 166. The output 140
of the encoder thus provides the inter-coded image as compressed data,
which represent the encoded image in relation to the reference image by means
of motion vectors and difference data. The motion estimation is carried out
using the luminance blocks but the difference data to be encoded are
calculated both for the luminance and the chrominance blocks.

[0027] Inverse quantization is also performed on the difference data
138 of the inter-coded image in the inverse quantization block 112 and inverse
discrete cosine transform in the inverse discrete cosine transform block 114.
The difference data 144 processed this way are supplied to block 124, where
the previous image 154, which was subtracted in the encoding of the
inter-image in question and obtained from the place indicated by the motion
vector, is added to the difference data. The sum 146 of the difference data
and the previous image is supplied from block 124 to the frame buffer 116 to
obtain a reconstructed image. The reconstructed image corresponds to the image
obtained in the decoder when the inter-coded image 140 is decoded. Thus the
frame buffer 116 has a reference image ready for encoding of the image 132
received next from the frame buffer 102.

[0028] The control block 100 controls the function of the encoder. In
addition to selection of the coding mode, it controls selection 160 of the
correct quantization ratio and performance 162 of variable length encoding,
for instance. The control block 100 may also control other encoder blocks even
though this is not illustrated in Figure 1. For example, the function of the
motion estimation block 118 is controlled by the control block 100.

[0029] In the following, a method of encoding successive images
will be described with reference to the flow chart shown in Figure 8. The
encoding is presented expressly in respect of reduction of temporal
redundancy and no other methods of redundancy reduction are described
here. The method starts in block 800, where the encoder is started. In block
802, the next image is retrieved from the frame memory. The image may be
e.g. the qcif-sized image with 176 x 144 luminance pixels illustrated in
Figure 2. The image is divided into macro blocks 200 whose luminance parts
have the size of 16 x 16 pixels. The macro blocks form eleven columns and
nine rows. The luminance pixels can be presented by the matrix

$$f_{i,j} = \begin{bmatrix}
f_{0,0} & f_{0,1} & f_{0,2} & \cdots & f_{0,175} \\
f_{1,0} & f_{1,1} & f_{1,2} & \cdots & f_{1,175} \\
f_{2,0} & f_{2,1} & f_{2,2} & \cdots & f_{2,175} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
f_{143,0} & f_{143,1} & f_{143,2} & \cdots & f_{143,175}
\end{bmatrix} \qquad (2)$$

where i = 0, 1, 2, ..., 143 and j = 0, 1, 2, ..., 175.

[0030] In block 804, the image is processed into an indexed image
by dividing the image into parts referred to with indexes and forming a number
from the values of pixels in each part to describe pixel values in the part
concerned.

[0031] In an embodiment, the pixels included in the part form a
square because this is advantageous according to the experiments carried out
by the applicant. The image size and block size set certain limits on the size
of the indexed part used in the method. According to the applicant's
experiments, it is advantageous in the case of a qcif-sized image that the
part includes 4 x 4 pixels. In that case matrix 2 can be presented as follows:

$$F_{I,J} = \begin{bmatrix}
F_{0,0} & F_{0,1} & F_{0,2} & \cdots & F_{0,172} \\
F_{1,0} & F_{1,1} & F_{1,2} & \cdots & F_{1,172} \\
F_{2,0} & F_{2,1} & F_{2,2} & \cdots & F_{2,172} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
F_{140,0} & F_{140,1} & F_{140,2} & \cdots & F_{140,172}
\end{bmatrix} \qquad (3)$$

where I = 0, 1, 2, ..., 140 and J = 0, 1, 2, ..., 172.

[0031] In addition to indexing, a number describing pixel values in a
given part is formed from the values of pixels in each part. The simplest way
to obtain this number is to add together the numerical values of the pixels in
the part concerned. The number for each part of matrix 3 is obtained from
matrix 2 by the following formula:

$$F_{I,J} = \sum_{i=I}^{I+3} \sum_{j=J}^{J+3} f_{i,j} \qquad (4)$$

[0032] For example, the number for the part referred to with index
$F_{0,0}$ is obtained as follows

$$F_{0,0} = \sum_{i=0}^{3} \sum_{j=0}^{3} f_{i,j} \qquad (5)$$

the number for the part referred to with index $F_{0,1}$ is obtained as
follows

$$F_{0,1} = \sum_{i=0}^{3} \sum_{j=1}^{4} f_{i,j} \qquad (6)$$

the number for the part referred to with index $F_{1,0}$ is obtained as
follows

$$F_{1,0} = \sum_{i=1}^{4} \sum_{j=0}^{3} f_{i,j} \qquad (7)$$

and the number for the part referred to with index $F_{1,1}$ is obtained as
follows

$$F_{1,1} = \sum_{i=1}^{4} \sum_{j=1}^{4} f_{i,j} \qquad (8)$$



[0033] When matrixes 2 and 3 are compared, it can be noted that
the parts referred to with indexes and located close to each other partly
overlap. This means that the areas referred to with indexes $F_{0,0}$ and
$F_{1,0}$, for example, include twelve same pixels of matrix 2, i.e. pixels
$f_{1,0}$, $f_{1,1}$, $f_{1,2}$, $f_{1,3}$, $f_{2,0}$, $f_{2,1}$, $f_{2,2}$,
$f_{2,3}$, $f_{3,0}$, $f_{3,1}$, $f_{3,2}$ and $f_{3,3}$. When the numbers for
two adjacent parts referred to with indexes are calculated in an embodiment,
the number already calculated for one of the parts is utilized in the
calculation of the number for the other part. This principle of sliding
calculation can be described as follows. As stated above, the number for the
part referred to with index $F_{0,0}$ can be calculated using formula 5. In
that case the other numbers are obtained in a sliding manner

$$F_{1,0} = F_{0,0} - \sum_{j=0}^{3} f_{0,j} + \sum_{j=0}^{3} f_{4,j} \qquad (10)$$
[0034] The same principle can be applied to the calculation of all
numbers, also when one proceeds from the left side of matrix 2 to its right side.
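
The indexing and sliding calculation described above can be sketched in C as follows. This is a hedged illustration, not the applicant's implementation: the names `window_sum` and `build_index_sliding` are hypothetical, the image is assumed to be a row-major array of 8-bit luminance values, and the sliding is organized with running sums of four pixels per column, which is one possible way of reusing previously calculated numbers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Direct evaluation of formula 4: sum of the 4x4 part whose upper left
   pixel is at row I, column J of a row-major image. */
static uint16_t window_sum(const uint8_t *img, int width, int I, int J)
{
    unsigned sum = 0;
    for (int i = I; i <= I + 3; i++)
        for (int j = J; j <= J + 3; j++)
            sum += img[i * width + j];
    return (uint16_t)sum;   /* at most 16 * 255 = 4080 */
}

/* Fills idx, a (height - 3) x (width - 3) row-major table, with the same
   numbers by sliding: each step removes the pixels that left the part and
   adds the pixels that entered it. */
static void build_index_sliding(const uint8_t *img, int width, int height,
                                uint16_t *idx)
{
    assert(width >= 4 && height >= 4);
    int iw = width - 3;
    int ih = height - 3;
    int *col4 = malloc(sizeof *col4 * (size_t)width);
    assert(col4 != NULL);

    /* Sum of rows 0..3 in every column. */
    for (int j = 0; j < width; j++)
        col4[j] = img[j] + img[width + j] + img[2 * width + j] + img[3 * width + j];

    for (int I = 0; I < ih; I++) {
        int s = col4[0] + col4[1] + col4[2] + col4[3];
        idx[I * iw] = (uint16_t)s;
        for (int J = 1; J < iw; J++) {
            s += col4[J + 3] - col4[J - 1];      /* slide one column right */
            idx[I * iw + J] = (uint16_t)s;
        }
        if (I + 1 < ih)                          /* slide one row down */
            for (int j = 0; j < width; j++)
                col4[j] += img[(I + 4) * width + j] - img[I * width + j];
    }
    free(col4);
}
```

For a qcif-sized image (176 x 144) the table has 173 x 141 entries, and every entry fits in 16 bits because the largest possible sum is 4080.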

[0035] After the image has been processed into an indexed image
in block 804 according to the principles described above, one can move to
block 806, where the coding mode to be used, i.e. intra-coding or inter-coding,
is selected.

[0036] If the selected coding mode is intra-coding, we proceed in
accordance with arrow 808 from block 806 to block 810, where the image is
encoded, i.e. discrete cosine transform and quantization are performed on it
but no motion estimation. Then we move to block 826, where the intra-coded
image is stored as a reference image. After this, the indexed image is stored
in block 828 as an indexed reference image referred to with indexes.
Furthermore, a number is formed from the values of pixels in each part to
describe pixel values in the part concerned. It should be noted that the fact
that the image has been processed into an indexed image in block 804 is
utilized here. It should also be noted that the operations of blocks 826 and
828 may be optional since the reference image does not necessarily need to be
the image that immediately precedes the image to be encoded but it can also be
an earlier image.

[0037] From block 828 we move to block 830, where it is checked
whether there are images to be encoded left. If there are no images left, we
move in accordance with arrow 832 to block 834, where the method ends. If
there are images left, we move in accordance with arrow 836 to block 802,
where the next image is retrieved from the frame memory.

[0038] If the coding mode selected in block 806 is inter-coding, we
move in accordance with arrow 812 from block 806 to block 814, where a
search area, in which the block to be encoded in the indexed image is searched
for, is defined in the indexed reference image. Inter-coding cannot thus be
performed on the first image because it requires at least one reference image.
In our example there has to be one intra-coded image, i.e. the operations of
blocks 810, 826 and 828 have to have been performed on the reference image
in question. In further phases, the reference image may naturally be an image
which has been inter-coded earlier.

[0039] Figures 3 and 4 illustrate how a search is performed. Figure
3 illustrates part of an image 300 to be encoded and Figure 4 part of a
reference image 400. The parts 300, 400 illustrate the same point of the
qcif-sized image shown in Figure 2. The image 300 to be encoded thus consists
of luminance blocks with a size of 16 x 16 pixels. The size of the chrominance
blocks is usually 8 x 8 pixels but they are not shown in Figures 3 and 4
because chrominance blocks are not utilized in motion estimation. It should be
noted that, for the sake of clarity, Figures 3 and 4 describe the real content
of the images without indexing.

[0040] In our example it is assumed that the image includes nothing
else but an image element which fits exactly in one block 302 and consists of
diagonal lines and letter H. A search area 402 is thus defined in the
reference image 400, which is the indexed reference image in our example. This
area is searched for an image element included in the image to be encoded,
i.e. the indexed image to be encoded in our example. The image element is
located in block 302. The search for motion vectors is usually limited to a
search area 402 with a size of [-16, 16], in which case the search area 402
consists of nine blocks of 16 x 16 pixels. The nine blocks of the search area
402 are located in the reference image 400 in the manner shown in Figure 4
around the location of the block 302 to be encoded in the image to be encoded
300. The size of the search area 402 is thus 48 x 48 pixels. In that case the
number of possible motion vectors, i.e. motion vector candidates, is 33 x 33.

[0041] After this, we move to block 816, where a cost function is
calculated for each motion vector candidate using the indexed image and the
indexed reference image. Figures 5, 6 and 7 illustrate indexing by describing
part of the search area 402 of Figure 4 pixel-by-pixel 502. Figure 5 shows
which ones of the areas 500 referred to with indexes are needed in the
calculation of motion vector candidate [0,0], i.e. the following elements of
matrix 3 are used:

$$F = \begin{bmatrix}
F_{0,0} & F_{0,4} & F_{0,8} & F_{0,12} \\
F_{4,0} & F_{4,4} & F_{4,8} & F_{4,12} \\
F_{8,0} & F_{8,4} & F_{8,8} & F_{8,12} \\
F_{12,0} & F_{12,4} & F_{12,8} & F_{12,12}
\end{bmatrix} \qquad (11)$$

[0042] The following elements of matrix 3 are needed to calculate
motion vector candidate [1,0] according to Figure 6:

$$F = \begin{bmatrix}
F_{0,1} & F_{0,5} & F_{0,9} & F_{0,13} \\
F_{4,1} & F_{4,5} & F_{4,9} & F_{4,13} \\
F_{8,1} & F_{8,5} & F_{8,9} & F_{8,13} \\
F_{12,1} & F_{12,5} & F_{12,9} & F_{12,13}
\end{bmatrix} \qquad (12)$$



[0043] The numbers needed to calculate all possible motion vector
candidates are obtained by the same principle from matrix 3. For example,
according to Figure 7, the following elements of matrix 3 are needed for
motion vector candidate [4,0]:

$$F = \begin{bmatrix}
F_{0,4} & F_{0,8} & F_{0,12} & F_{0,16} \\
F_{4,4} & F_{4,8} & F_{4,12} & F_{4,16} \\
F_{8,4} & F_{8,8} & F_{8,12} & F_{8,16} \\
F_{12,4} & F_{12,8} & F_{12,12} & F_{12,16}
\end{bmatrix} \qquad (12)$$



[0044] Motion estimation in the search area can be presented by
the formula:

$$MV = \operatorname*{arg\,min}_{x,\,y \,\in\, [-16,\,16]} \sum_{i=0}^{3} \sum_{j=0}^{3} \left| F_{4i,\,4j} - R_{4i+x,\,4j+y} \right| \qquad (13)$$

where MV is the final motion vector, $F_{4i,\,4j}$ is the number calculated
for the index of the image to be encoded, and $R_{4i+x,\,4j+y}$ is the number
calculated for the index of the reference image. When formula 13 used in our
method is compared with prior art formula 1, it can be noted that motion
estimation requires 33 x 33 x 4 x 4 = 17 424 calculations in our method, i.e.
16 times fewer calculations than the prior art full search method. Calculation
of indexes has not been taken into account here, but they need to be
calculated only once for the whole image. In our example, the SAD function
(Sum of Absolute Differences) is used as the cost function but it is clear to
a person skilled in the art that other cost functions may also be used if the
image indexing method can be utilized in their calculation.
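
The indexed search can be sketched in C as follows. The sketch is an illustration under stated assumptions rather than the applicant's implementation: the names `build_index_direct`, `indexed_sad` and `indexed_search` are hypothetical, index tables are stored row-major with (height - 3) x (width - 3) entries, `dx` is taken as the horizontal and `dy` as the vertical displacement, and candidates whose index entries fall outside the table are skipped.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Index table of 4x4 sums (formula 4), computed directly for brevity. */
static void build_index_direct(const uint8_t *img, int width, int height,
                               uint16_t *idx)
{
    int iw = width - 3;
    for (int I = 0; I + 3 < height; I++)
        for (int J = 0; J + 3 < width; J++) {
            unsigned sum = 0;
            for (int i = I; i <= I + 3; i++)
                for (int j = J; j <= J + 3; j++)
                    sum += img[i * width + j];
            idx[I * iw + J] = (uint16_t)sum;
        }
}

/* Cost of one candidate: 4 x 4 = 16 absolute differences between index
   entries taken every fourth row and column, instead of 256 pixel terms. */
static unsigned indexed_sad(const uint16_t *F, const uint16_t *R, int iw,
                            int bx, int by, int dx, int dy)
{
    unsigned sum = 0;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++) {
            int a = F[(by + 4 * i) * iw + (bx + 4 * j)];
            int b = R[(by + dy + 4 * i) * iw + (bx + dx + 4 * j)];
            sum += (unsigned)abs(a - b);
        }
    return sum;
}

/* Tests all candidates in [-16, 16] for the block at column bx, row by and
   returns the number of candidates evaluated. */
static int indexed_search(const uint16_t *F, const uint16_t *R, int iw, int ih,
                          int bx, int by, int *best_dx, int *best_dy)
{
    assert(bx >= 0 && by >= 0 && bx + 12 < iw && by + 12 < ih);
    unsigned best = UINT_MAX;
    int evaluated = 0;
    *best_dx = 0;
    *best_dy = 0;
    for (int dy = -16; dy <= 16; dy++)
        for (int dx = -16; dx <= 16; dx++) {
            if (bx + dx < 0 || by + dy < 0 ||
                bx + dx + 12 >= iw || by + dy + 12 >= ih)
                continue;
            unsigned cost = indexed_sad(F, R, iw, bx, by, dx, dy);
            evaluated++;
            if (cost < best) {
                best = cost;
                *best_dx = dx;
                *best_dy = dy;
            }
        }
    return evaluated;
}
```

When all 33 x 33 candidates are inside the table, the search evaluates 1089 x 16 = 17 424 absolute differences, which matches the count given in the text. Note that equal index numbers do not guarantee identical pixels, so the indexed cost approximates the pixel-level SAD.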

[0045] In our example shown in Figures 3 and 4, a block 404
corresponding to the block 302 to be encoded in the (indexed) image 300 to be
encoded was found in the (indexed) reference image. Motion of the block 302 to
be encoded with respect to the block 404 found in the reference image is
expressed by a motion vector 406. The motion vector may be described as a
motion vector of the pixel in the upper left corner of the block 302 to be
encoded, for instance. The other pixels of block 302 naturally also move in
the direction of the motion vector concerned.

[0046] The origin (0, 0) of the image is usually the pixel in the
upper left corner of the image. In video encoding terminology motions are
expressed as follows: a motion to the right is positive, a motion to the left
is negative, a motion up is negative and a motion down is positive. The motion
vector 406 is (12, -4), i.e. the motion is twelve pixels to the right in the
direction of the X axis and four pixels up in the direction of the Y axis.

[0047] From block 816 we then move to block 818, where the image
block to be encoded is encoded using the motion vector candidate that gives
the lowest cost function value. The motion vector candidate thus defines the
motion between the image block to be encoded and the candidate block in the
search area of the reference image.

[0048] Then we move to block 820, where it is tested whether
blocks to be encoded are still left in the image to be encoded. If there are
blocks to be encoded left, we move in accordance with arrow 822 to block 814,
where the search for the block corresponding to the next block to be encoded
starts in the reference image. The loop according to arrow 822 is repeated
until the blocks of the image to be encoded have been processed in the desired
manner, either all or some of them.

[0049] If there are no more blocks to be encoded left, we move in
accordance with arrow 824 first from block 820 to block 826 and then to block
828, where, if desired, the image that was just encoded is stored as a
reference image and the indexed and encoded image is stored as an indexed
reference image. Then it is checked in block 830 whether there are still
images to be encoded left. If there are no images to be encoded left, we move
in accordance with arrow 832 to block 834, where the method ends; otherwise
the process continues in accordance with arrow 836 to block 802, where the
next image to be encoded is retrieved.

[0050] In the method described in Figure 8, the processing of the
image to be encoded into an indexed image is utilized so that the result of the
same processing can also be stored as an indexed reference image. This
reduces the number of necessary calculations, but if the use of memory is to
be optimized, the indexed reference image can be calculated from the stored
reference image just before it is used.

[0051] In an embodiment, after the motion vector candidate giving the
lowest cost function value has been found with an accuracy of one pixel, the
area around the motion vector candidate in question is searched for the best
motion vector candidate with an accuracy of half a pixel. This is illustrated
in Figure 8 with block 840. The location of the 16 x 16 block found in motion
estimation is thus further checked with an accuracy of half a pixel. Usually
this requires an 18 x 18 matrix so that the pixels can be interpolated, but
with our method, which is based on indexes, the area to be interpolated is a
6 x 6 matrix of indexes. A motion vector of half a pixel is obtained by
applying formula 13, in which case the method also needs 16 times fewer
calculations than traditional motion estimation with an accuracy of half a
pixel.

[0052] In an embodiment, the best motion vector candidate is searched for
with an accuracy of half a pixel as follows:

interpolating values of half a pixel for the indexed candidate block found,
which corresponds to the one-pixel motion vector candidate in the reference
image, and around the block;

calculating a cost function for each motion vector candidate of half a pixel
using the indexed image and the interpolated and indexed candidate block; and

encoding the image block to be encoded using the motion vector candidate of
half a pixel that gives the lowest cost function value.
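
The three steps above can be sketched in C as follows. This is only an illustrative sketch under stated assumptions, not the method of formula 13, which is not reproduced in this passage: the half-pixel index is approximated by plain bilinear averaging of neighbouring one-pixel indexes, a 16 x 16 block is assumed to be compared through 16 indexes sampled every four pixels, and the names half_pel_index, half_pel_sad and half_pel_refine are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Index at the half-pixel position (x2/2, y2/2), with x2, y2 >= 0 given in
 * half-pixel units. Assumption: rounded bilinear average of the
 * neighbouring one-pixel indexes. */
static unsigned half_pel_index(const uint16_t *ref, int stride, int x2, int y2)
{
    int fx = x2 & 1;                      /* 1 if halfway between columns */
    int fy = y2 & 1;                      /* 1 if halfway between rows */
    const uint16_t *p = ref + (y2 >> 1) * stride + (x2 >> 1);

    return (p[0] + p[fx] + p[fy * stride] + p[fy * stride + fx] + 2u) >> 2;
}

/* SAD between the 4 x 4 sampled indexes of the block to be encoded and the
 * interpolated indexes of the candidate at half-pixel position (x2, y2). */
static unsigned half_pel_sad(const uint16_t *cur, int cur_stride,
                             const uint16_t *ref, int ref_stride,
                             int x2, int y2)
{
    unsigned sad = 0;

    for (int r = 0; r < 4; r++) {
        for (int c = 0; c < 4; c++) {
            unsigned a = cur[4 * r * cur_stride + 4 * c];
            unsigned b = half_pel_index(ref, ref_stride, x2 + 8 * c, y2 + 8 * r);
            sad += a > b ? a - b : b - a;
        }
    }
    return sad;
}

/* Tests the one-pixel candidate (x0, y0) and its eight half-pixel
 * neighbours. The winning offset is returned in half-pixel units
 * (-1, 0 or +1) through hx and hy; the return value is its cost. */
unsigned half_pel_refine(const uint16_t *cur, int cur_stride,
                         const uint16_t *ref, int ref_stride,
                         int x0, int y0, int *hx, int *hy)
{
    unsigned best = UINT_MAX;

    assert(x0 >= 1 && y0 >= 1 && hx != NULL && hy != NULL);
    for (int dy = -1; dy <= 1; dy++) {
        for (int dx = -1; dx <= 1; dx++) {
            unsigned cost = half_pel_sad(cur, cur_stride, ref, ref_stride,
                                         2 * x0 + dx, 2 * y0 + dy);
            if (cost < best) {
                best = cost;
                *hx = dx;
                *hy = dy;
            }
        }
    }
    return best;
}
```

Here cur points to the first index of the block to be encoded, and ref to the origin of the indexed reference table; the caller is assumed to guarantee that one index of margin exists around the candidate, which the overfilled table provides.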

[0053] An embodiment utilizes the well-known fact that existing encoding
standards also allow motion vectors pointing outside the image. In the
indexed motion estimation described, this is achieved by overfilling the
index table so that there are 16 pixels on each edge of the image, i.e. at
the top, at the bottom and on the sides, which have been copied there from
the outer edges of the actual image area. This can be performed using block
842 of Figure 8. The size of the indexed matrix 3 is the image size minus
three in each dimension, i.e. in our example 173 x 141. When the overfilling
is taken into account, the size is (173+32) x (141+32), or (176+29) x
(144+29), i.e. 205 x 173. Number 29 is obtained by subtracting three from
the space required by overfill, i.e. number 32.
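
The size arithmetic above can be checked with a short C sketch. The names table_size, index_table_size and overfilled_table_size are hypothetical and used only for illustration; the constants follow the 4 x 4 part size and the 16-entry overfill stated in the text.

```c
#include <assert.h>

enum { PART_SIZE = 4, OVERFILL = 16 };

typedef struct {
    int cols;
    int rows;
} table_size;

/* Number of indexes when a 4 x 4 part is placed at every pixel position
 * where it still fits inside the image: image size minus three. */
table_size index_table_size(int pels, int lines)
{
    table_size s;

    assert(pels >= PART_SIZE && lines >= PART_SIZE);
    s.cols = pels - (PART_SIZE - 1);
    s.rows = lines - (PART_SIZE - 1);
    return s;
}

/* Table size once 16 entries of overfill are added on every edge,
 * i.e. image size plus 32 - 3 = 29. */
table_size overfilled_table_size(int pels, int lines)
{
    table_size s = index_table_size(pels, lines);

    s.cols += 2 * OVERFILL;
    s.rows += 2 * OVERFILL;
    return s;
}
```

For the qcif example, index_table_size(176, 144) gives 173 x 141 and overfilled_table_size(176, 144) gives 205 x 173.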

[0054] In the following, calculation of an indexed table and overfilling of
indexes are described by pseudo code which employs the syntax of the C
programming language.

#include <stdint.h>
#include <stdlib.h>

typedef uint8_t  u8;
typedef uint16_t u16;

typedef struct {
    u8  *image;  /* Pointer to image to be indexed */
    u16  pels;   /* Number of columns in image */
    u16  lines;  /* Number of rows in image */
    u16 *mv;     /* Pointer to index table, (pels+29)*(lines+29) entries */
} index_s;

void IndexOverfill(index_s *vop);

void MeIndex(index_s *vop)
{
    u16 i, j;
    u16 pelsMv;
    u16 *tmp = NULL;    /* Sums of four adjacent pixels, one per image row */
    u16 *tmp1 = NULL;   /* Sliding pointer into tmp */
    u16 *put = NULL;    /* Write position in index table */
    u16 *start = NULL;
    u8 *block = NULL;   /* Read position in image */

    /* Temporary memory of one index column */
    tmp = (u16 *)malloc(sizeof(u16)*vop->lines);
    if (tmp == NULL)
        return;

    /* Number of columns in index table, */
    /* space required by overfill taken into account */
    pelsMv = vop->pels+29;

    /* Pointer to first index, represents motion vector (0,0) */
    start = vop->mv+pelsMv*16+16;

    /* Calculate first column of index table */
    block = vop->image;
    put = start;
    for (j = 0; j < 4; j++) {
        tmp[j] = block[0]+block[1]+block[2]+block[3];
        block += vop->pels;
    }
    put[0] = tmp[0]+tmp[1]+tmp[2]+tmp[3];
    tmp1 = tmp;
    for (j = 4; j < vop->lines; j++) {
        /* Add the new bottom row sum, drop the old top row sum */
        tmp1[4] = block[0]+block[1]+block[2]+block[3];
        put[pelsMv] = put[0]+tmp1[4]-tmp1[0];
        block += vop->pels; tmp1++; put += pelsMv;
    }

    /* Calculate the rest of columns in index table */
    for (i = 1; i < vop->pels-3; i++) {
        block = vop->image+i;
        put = start+i;
        for (j = 0; j < 4; j++) {
            /* Slide each row sum one pixel to the right */
            tmp[j] = tmp[j]+block[3]-block[-1];
            block += vop->pels;
        }
        put[0] = tmp[0]+tmp[1]+tmp[2]+tmp[3];
        tmp1 = tmp;
        for (j = 4; j < vop->lines; j++) {
            tmp1[4] = tmp1[4]+block[3]-block[-1];
            put[pelsMv] = put[0]+tmp1[4]-tmp1[0];
            block += vop->pels; tmp1++; put += pelsMv;
        }
    }

    free(tmp);

    /* Overfill index table */
    IndexOverfill(vop);

    return;
}

void IndexOverfill(index_s *vop)
{
    u16 i, j;
    u16 pelsMv;
    u16 *put = NULL;  /* Help variable, pointer to index table */
    u16 *get = NULL;  /* Help variable, pointer to index table */

    /* Number of columns in index table, */
    /* space required by overfill taken into account */
    pelsMv = vop->pels + 29;

    /* Overfill middle section of table from top */
    /* Only pels-3 columns of a row hold calculated indexes */
    put = vop->mv + 16;
    get = vop->mv + pelsMv*16 + 16;
    for (j = 0; j < 16; j++) {
        for (i = 0; i < vop->pels-3; i++) {
            put[i] = get[i];
        }
        put += pelsMv;
    }

    /* Overfill middle section of table from bottom */
    put = vop->mv + (vop->lines+13)*pelsMv+16;
    get = vop->mv + (vop->lines+12)*pelsMv+16;
    for (j = 0; j < 16; j++) {
        for (i = 0; i < vop->pels-3; i++) {
            put[i] = get[i];
        }
        put += pelsMv;
    }

    /* Overfill table from left */
    put = vop->mv;
    get = vop->mv + 16;
    for (j = 0; j < vop->lines+29; j++) {
        for (i = 0; i < 16; i++) {
            put[i] = *get;
        }
        put += pelsMv; get += pelsMv;
    }

    /* Overfill table from right */
    put = vop->mv + vop->pels+13;
    get = vop->mv + vop->pels+12;
    for (j = 0; j < vop->lines+29; j++) {
        for (i = 0; i < 16; i++) {
            put[i] = *get;
        }
        put += pelsMv; get += pelsMv;
    }

    return;
}
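
As a cross-check of the pseudo code above, an entry of the overfilled index table can also be computed directly from the image. The sketch below is a slow reference formulation, assuming that overfilling amounts to repeating the nearest calculated index; the names part_sum and overfilled_index are hypothetical and are not part of the pseudo code.

```c
#include <assert.h>
#include <stdint.h>

enum { PART = 4, BORDER = 16 };

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Sum of the 4 x 4 pixels whose top-left corner is at column x, row y. */
unsigned part_sum(const uint8_t *image, int pels, int x, int y)
{
    unsigned sum = 0;

    for (int r = 0; r < PART; r++)
        for (int c = 0; c < PART; c++)
            sum += image[(y + r) * pels + (x + c)];
    return sum;
}

/* Expected value of the overfilled table entry at (row, col): positions in
 * the 16-entry border repeat the nearest calculated index, which is the
 * same as clamping the part position to the image. */
unsigned overfilled_index(const uint8_t *image, int pels, int lines,
                          int row, int col)
{
    assert(pels >= PART && lines >= PART);
    assert(row >= 0 && row < lines - (PART - 1) + 2 * BORDER);
    assert(col >= 0 && col < pels - (PART - 1) + 2 * BORDER);

    int x = clamp_int(col - BORDER, 0, pels - PART);
    int y = clamp_int(row - BORDER, 0, lines - PART);
    return part_sum(image, pels, x, y);
}
```

Comparing overfilled_index(image, pels, lines, row, col) with the table entry mv[row*(pels+29)+col] for every row and column is one way to test an implementation of the running-sum calculation.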

[0055] The method described can be implemented using the encoder shown in
Figure 1, for example. The encoder shown in Figure 1, i.e. the apparatus for
encoding successive images, comprises means 110 for encoding the image block
to be encoded using the motion vector candidate that gives the lowest cost
function value. The motion vector candidate defines the motion between the
image block to be encoded and the candidate block in the search area of the
reference image. The apparatus further comprises means 118 for

processing the image into an indexed image and the reference image into an
indexed reference image so that the image and the reference image are
divided into parts referred to with indexes and a number is formed from the
values of pixels in each part to describe pixel values in a given part;

defining a search area in the indexed reference image where the block to be
encoded in the indexed image is searched for; and

calculating a cost function for each motion vector candidate using the
indexed image and the indexed reference image.
[0056] The apparatus can also be configured to encode the image block to be
encoded using the motion vector candidate that gives the lowest cost
function value. The motion vector candidate defines the motion between the
image block to be encoded and a candidate block in the search area of a
reference image. In that case the apparatus is also configured to:

process the image into an indexed image and the reference image into an
indexed reference image so that the image and the reference image are
divided into parts referred to with indexes and a number is formed from the
values of pixels in each part to describe pixel values in a given part;

define a search area in the indexed reference image where the block to be
encoded in the indexed image is searched for; and

calculate a cost function for each motion vector candidate using the
indexed image and the indexed reference image.
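
The search over the indexed reference image can be sketched in C as follows. This is a minimal sketch under stated assumptions rather than the implementation of means 118: the cost function is a SAD, a 16 x 16 block is assumed to be represented by the 16 indexes sampled every four pixels, every candidate in the search area is tested, and the names indexed_sad and indexed_full_search are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

enum { BLOCK = 16, STEP = 4 };

/* SAD over the 16 indexes that cover a 16 x 16 block without overlap. */
static unsigned indexed_sad(const uint16_t *cur, int cur_stride,
                            const uint16_t *ref, int ref_stride)
{
    unsigned sad = 0;

    for (int r = 0; r < BLOCK; r += STEP) {
        for (int c = 0; c < BLOCK; c += STEP) {
            int d = (int)cur[r * cur_stride + c] - (int)ref[r * ref_stride + c];
            sad += (unsigned)(d < 0 ? -d : d);
        }
    }
    return sad;
}

/* Tests every motion vector candidate within +/- range around the index
 * that represents motion vector (0,0) and reports the cheapest one. */
unsigned indexed_full_search(const uint16_t *cur, int cur_stride,
                             const uint16_t *ref, int ref_stride,
                             int range, int *mvx, int *mvy)
{
    unsigned best = UINT_MAX;

    assert(range >= 0 && mvx != NULL && mvy != NULL);
    for (int dy = -range; dy <= range; dy++) {
        for (int dx = -range; dx <= range; dx++) {
            unsigned cost = indexed_sad(cur, cur_stride,
                                        ref + dy * ref_stride + dx, ref_stride);
            if (cost < best) {
                best = cost;
                *mvx = dx;
                *mvy = dy;
            }
        }
    }
    return best;
}
```

Here cur points to the first index of the block to be encoded and ref to the co-located index in the indexed reference image. The caller is assumed to keep the search area inside the index table. Each candidate costs 16 absolute differences instead of the 256 needed for a pixelwise comparison of a 16 x 16 block.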

[0057] The encoder blocks shown in Figure 1 can be implemented as one or
more application-specific integrated circuits (ASIC). Other embodiments are
also feasible, such as a circuit built of separate logic components, or a
processor with its software. A hybrid of these different embodiments is also
feasible. When selecting the method of implementation, a person skilled in
the art will consider the requirements set on the size and power consumption
of the device, necessary processing capacity, production costs and
production volumes, for example. The above-mentioned means can be placed in
the encoder blocks described or they can be implemented as new blocks
related to the blocks described. For example, the means for processing an
image into an indexed image and a reference image into an indexed reference
image can be implemented in block 118 or in the frame buffer 102, or using a
new block connected to the frame buffer 102. The device can also be
configured using the described blocks or new blocks. One embodiment of the
encoder is a computer program on a carrier for encoding successive images,
comprising computer executable instructions for causing a computer to
perform the encoding when the software is run. The carrier can be any means
for distributing the software to the customers. The carrier can be a
distribution package (containing a diskette, CD-ROM or another computer
readable medium for storing the software), a computer memory (for example a
programmed memory chip or another memory device connectable to the
computer), or a telecommunications signal (for example a signal transferred
in the Internet and/or in a cellular radio network containing the software
in normal or compressed format). The encoder may also be part of a complete
video codec.

[0058] Even though the invention was described above with reference to an
example according to the accompanying drawings, it is clear that the
invention is not limited thereto but it may be modified in various ways
within the inventive concept disclosed in the appended claims. Thus the size
of images to be processed may differ from the qcif size used in the example.
This does not significantly change the implementation of the invention.

CLAIMS

1. A method of encoding successive images, comprising the following steps:

encoding (818) an image block to be encoded using the motion vector
candidate that gives the lowest cost function value, the motion vector
candidate defining the motion between the image block to be encoded and a
candidate block in the search area of a reference image;

characterized by prior to encoding:

processing (804) the image into an indexed image and the reference image
into an indexed reference image so that the image and the reference image
are divided into parts referred to with indexes and a number is formed from
the values of pixels in each part to describe pixel values in a given part;

defining (814) a search area in the reference image where the block to be
encoded in the indexed image is searched for; and

calculating (816) a cost function for each motion vector candidate using
the indexed image and the indexed reference image.

2. A method according to claim 1, characterized in that the 
pixels included in the part form a square. 

3. A method according to claim 2, characterized in that the part includes
4x4 pixels.

4. A method according to any one of the preceding claims, characterized in
that a SAD function (Sum of Absolute Differences) is used as the cost
function.

5. A method according to any one of the preceding claims, characterized in
that parts located close to each other and referred to with indexes partly
overlap.

6. A method according to claim 5, characterized in that when numbers are
calculated for two adjacent parts referred to with indexes, the number
calculated for the second part is utilized in the calculation of the number
for the first part.

7. A method according to any one of the preceding claims, characterized in
that after the motion vector candidate giving the lowest cost function value
has been found with an accuracy of one pixel, the best motion vector
candidate is searched for with an accuracy of half a pixel around the motion
vector candidate concerned.

8. A method according to claim 7, characterized in that the best motion
vector candidate with an accuracy of half a pixel is searched for as
follows:

interpolating values of half a pixel for the indexed candidate block found,
which corresponds to the one-pixel motion vector candidate in the reference
image, and around the block;

calculating a cost function for each motion vector candidate of half a pixel
using the indexed image and interpolated and indexed candidate block; and

encoding the image block to be encoded using the motion vector candidate of
half a pixel that gives the lowest cost function value.

9. An apparatus for encoding successive images, comprising:

means (110) for encoding an image block to be encoded using the motion
vector candidate that gives the lowest cost function value, the motion
vector candidate defining the motion between the image block to be encoded
and a candidate block in the search area of a reference image;

characterized in that the apparatus further comprises:

means (118) for processing the image into an indexed image and the reference
image into an indexed reference image so that the image and the reference
image are divided into parts referred to with indexes and a number is formed
from the values of pixels in each part to describe pixel values of a given
part;

means (118) for defining a search area in the indexed reference image where
the block to be encoded in the indexed image is searched for; and

means (118) for calculating a cost function for each motion vector candidate
using the indexed image and the indexed reference image.

10. An apparatus according to claim 9, characterized in that 
the pixels included in the part form a square. 

11. An apparatus according to claim 10, characterized in that the part
includes 4x4 pixels.

12. An apparatus according to any one of claims 9 to 11, characterized in
that an SAD function (Sum of Absolute Differences) is used as the cost
function.

13. An apparatus according to any one of claims 9 to 12, characterized in
that parts located close to each other and referred to with indexes partly
overlap.

14. An apparatus according to claim 13, characterized in that when numbers
are calculated for two adjacent parts referred to with indexes, the number
calculated for the second part is utilized in the calculation of a number
for the first part.

15. An apparatus according to any one of preceding claims 9 to 14,
characterized in that after the motion vector candidate giving the lowest
cost function value has been found with an accuracy of one pixel, the best
motion vector candidate is searched for with an accuracy of half a pixel
around the motion vector candidate in question.

16. An apparatus according to claim 15, characterized in that the best
motion vector candidate with an accuracy of half a pixel is searched for as
follows:

interpolating values of half a pixel for the indexed candidate block found,
which corresponds to the one-pixel motion vector candidate in the reference
image, and around the block;

calculating a cost function for each motion vector candidate of half a pixel
using the indexed image and interpolated and indexed candidate block; and

encoding the image block to be encoded using the motion vector candidate of
half a pixel that gives the lowest cost function value.

17. An apparatus for encoding successive images which is configured to:

encode an image block to be encoded using the motion vector candidate that
gives the lowest cost function value, the motion vector candidate defining
the motion between the image block to be encoded and a candidate block in
the search area of a reference block;

characterized in that the apparatus is further configured to:

process the image into an indexed image and the reference image into an
indexed reference image so that the image and the reference image are
divided into parts referred to with indexes and a number is formed from the
values of pixels in each part to describe pixel values in a given part;

define a search area in the indexed reference image where the block to be
encoded in the indexed image is searched for; and

calculate a cost function for each motion vector candidate using the
indexed image and the indexed reference image.

18. An apparatus according to claim 17, characterized in 
that the pixels included in the part form a square. 

19. An apparatus according to claim 18, characterized in that the part
includes 4x4 pixels.

20. An apparatus according to any one of preceding claims 17 to 19,
characterized in that the apparatus is configured to use a SAD function (Sum
of Absolute Differences) as the cost function.

21. An apparatus according to any one of preceding claims 17 to 20,
characterized in that the parts located close to each other and referred to
with indexes partly overlap.

22. An apparatus according to claim 21, characterized in that when numbers
are calculated for two adjacent parts referred to with indexes, the
apparatus is configured to utilize the number calculated for the second part
in the calculation of a number for the first part.

23. An apparatus according to any one of preceding claims 17 to 22,
characterized in that after the motion vector candidate giving the lowest
cost function value has been found with an accuracy of one pixel, the
apparatus is configured to search for the best motion vector candidate with
an accuracy of half a pixel around the motion vector candidate in question.

24. An apparatus according to claim 23, characterized in that the apparatus
is configured to search the best motion vector candidate with an accuracy of
half a pixel as follows:

interpolating values of half a pixel for the indexed candidate block found,
which corresponds to the one-pixel motion vector candidate in the reference
image, and around the block;

calculating a cost function for each motion vector candidate of half a pixel
using the indexed image and interpolated and indexed candidate block; and

encoding the image block to be encoded using the motion vector candidate of
half a pixel that gives the lowest cost function value.

25. A computer program on a carrier for encoding successive images and
comprising computer executable instructions for causing a computer to:

encode an image block to be encoded using the motion vector candidate that
gives the lowest cost function value, the motion vector candidate defining
the motion between the image block to be encoded and a candidate block in
the search area of a reference image;

characterized by the computer program further comprising computer
executable instructions for causing a computer prior to encoding to:

process the image into an indexed image and the reference image into an
indexed reference image so that the image and the reference image are
divided into parts referred to with indexes and a number is formed from the
values of pixels in each part to describe pixel values in a given part;

define a search area in the reference image where the block to be encoded
in the indexed image is searched for; and

calculate a cost function for each motion vector candidate using the
indexed image and the indexed reference image.

26. A computer program according to claim 25, characterized in that the
pixels included in the part form a square.

27. A computer program according to claim 26, characterized in that the part
includes 4x4 pixels.

28. A computer program according to any one of the preceding claims 25-27,
characterized in that a SAD function (Sum of Absolute Differences) is used
as the cost function.

29. A computer program according to any one of the preceding claims 25-28,
characterized in that parts located close to each other and referred to with
indexes partly overlap.

30. A computer program according to claim 29, characterized in that when
numbers are calculated for two adjacent parts referred to with indexes, the
number calculated for the second part is utilized in the calculation of the
number for the first part.

31. A computer program according to any one of the preceding claims 25-30,
characterized in that after the motion vector candidate giving the lowest
cost function value has been found with an accuracy of one pixel, the best
motion vector candidate is searched for with an accuracy of half a pixel
around the motion vector candidate concerned.

32. A computer program according to claim 31, characterized in that the
computer program further comprises computer executable instructions for
causing the computer to search for the best motion vector candidate with an
accuracy of half a pixel as follows:

interpolate values of half a pixel for the indexed candidate block found,
which corresponds to the one-pixel motion vector candidate in the reference
image, and around the block;

calculate a cost function for each motion vector candidate of half a pixel
using the indexed image and interpolated and indexed candidate block; and

encode the image block to be encoded using the motion vector candidate of
half a pixel that gives the lowest cost function value.

33. A computer program according to any one of the preceding claims 25-32,
characterized in that the carrier is at least one of a distribution package,
a computer memory and a telecommunications signal.